44 research outputs found

    A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings

    First Online: 10 September 2018

    Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank, with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning’s frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym–associate pairs. This database, publicly available at www.blairarmstrong.net/homonymnorms/, constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data.
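    As a rough illustration of how rater classifications could be turned into relative meaning frequencies, here is a minimal sketch. It assumes each subtitle line has already been assigned exactly one meaning label. The function name `meaning_frequencies` and the example labels are hypothetical and are not taken from the published norms or their scoring procedure.

```python
from collections import Counter


def meaning_frequencies(labels):
    """Return the relative frequency of each meaning label in `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {meaning: n / total for meaning, n in counts.items()}


# Hypothetical rater labels for four subtitle lines containing "bank"
# (illustrative values only, not data from the study).
labels = ["FINANCIAL INSTITUTION"] * 3 + ["EDGE OF A RIVER"]
freqs = meaning_frequencies(labels)
```

    The same counting step would apply to classified free associates, with one label per homonym–associate pair instead of one per subtitle line.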

    Method For Making 2-Electron Response Reduced Density Matrices Approximately N-representable

    In methods like geminal-based approaches or coupled cluster that are solved using the projected Schrödinger equation, direct computation of the 2-electron reduced density matrix (2-RDM) is impractical and one falls back to a 2-RDM based on response theory. However, the 2-RDMs from response theory are not N-representable. That is, the response 2-RDM does not correspond to an actual physical N-electron wave function. We present a new algorithm for making these non-N-representable 2-RDMs approximately N-representable, i.e. they have the right symmetry and normalization and fulfill the P-, Q- and G-conditions. Next to an algorithm which can be applied to any 2-RDM, we have also developed a 2-RDM optimization procedure specifically for seniority-zero 2-RDMs. We aim to find the 2-RDM with the right properties that is the closest (in the sense of the Frobenius norm) to the non-N-representable 2-RDM by minimizing the square norm of the difference between the initial 2-RDM and the targeted 2-RDM under the constraint that the trace is normalized and the 2-RDM, Q- and G-matrices are positive semidefinite, i.e. their eigenvalues are non-negative. Our method is suitable for fixing non-N-representable 2-RDMs which are close to being N-representable. Through the N-representability optimization algorithm we add a small correction to the initial 2-RDM such that it fulfills the most important N-representability conditions.

    Comment: 13 pages, 8 figures
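    To make the constrained Frobenius-norm minimization more concrete, here is a minimal sketch of a much simpler, related problem. For a single symmetric matrix, the Frobenius-nearest positive semidefinite matrix with a prescribed trace keeps the eigenvectors and replaces the eigenvalues by their Euclidean projection onto the set of non-negative values with the required sum. The code shows only that eigenvalue step. It does not enforce the Q- or G-conditions and is not the authors' algorithm; the function name `project_to_simplex` and the sample eigenvalues are assumptions made for illustration.

```python
def project_to_simplex(values, total=1.0):
    """Euclidean projection of `values` onto {x : x_i >= 0, sum(x) = total}."""
    if total <= 0:
        raise ValueError("total must be positive")
    sorted_vals = sorted(values, reverse=True)
    cumulative = 0.0
    shift = 0.0
    for k, v in enumerate(sorted_vals, start=1):
        cumulative += v
        candidate = (cumulative - total) / k
        # Keep the shift from the last index at which the shifted value stays positive.
        if v - candidate > 0:
            shift = candidate
    # Shift every value uniformly and clip negatives to zero.
    return [max(v - shift, 0.0) for v in values]


# Hypothetical eigenvalues of a trace-normalized matrix with one negative eigenvalue.
eigenvalues = [0.7, 0.5, -0.2]
fixed = project_to_simplex(eigenvalues, total=1.0)
```

    The negative eigenvalue is clipped to zero and the remaining ones are shifted down uniformly so that the trace is preserved, which is the smallest correction in the Euclidean sense for this simplified setting.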

    Snapshots of a molecular swivel in action

    Members of the serine family of site-specific recombinases exchange DNA strands via 180° rotation about a central protein-protein interface. Modeling of this process has been hampered by the lack of structures in more than one rotational state for any individual serine recombinase. Here we report crystal structures of the catalytic domains of four constitutively active mutants of the serine recombinase Sin, providing snapshots of rotational states not previously visualized for Sin, including two seen in the same crystal. Normal mode analysis predicted that each tetramer's lowest frequency mode (i.e. most accessible large-scale motion) mimics rotation: two protomers rotate as a pair with respect to the other two. Our analyses also suggest that rotation is not a rigid body movement around a single symmetry axis but instead uses multiple pivot points and entails internal motions within each subunit.

    Inversion asymmetry effects in modulation-doped Cd1-xMnxTe quantum wells

    We report a striking in-plane anisotropy of the spin-flip Raman signals observed for dilute magnetic Cd1−xMnxTe quantum wells containing a two-dimensional electron gas. The effect depends upon electron concentration, which can be varied within a single sample via secondary above-barrier illumination. The experimental results are described in a simple, single-electron picture by a model of the conduction band Hamiltonian that includes contributions from Dresselhaus, Rashba, and Zeeman terms.

    Genetic Diversity and Association Studies in US Hispanic/Latino Populations: Applications in the Hispanic Community Health Study/Study of Latinos

    US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a “genetic-analysis group” variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.

    Genome-wide Association Study of Platelet Count Identifies Ancestry-Specific Loci in Hispanic/Latino Americans

    Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10⁻²⁸) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.

    Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure

    Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies.